Non-linear evolution of f{R) cosmologies I: methodology 
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We introduce the method and the implementation of a cosmological simulation of a class of metric- 
variation f{R) models that accelerate the cosmological expansion without a cosmological constant 
and evade solar-system bounds of small-field deviations to general relativity. Such simulations are 
shown to reduce to solving a non-linear Poisson equation for the scalar degree of freedom introduced 
by the f{R) modifications. We detail the method to efficiently solve the non-linear Poisson equation 
by using a Newton-Gauss-Seidel relax;ation scheme coupled with multigrid method to accelerate 
the convergence. The simulations are shown to satisfy tests comparing the simulated outcome to 
analytical solutions for simple situations, and the dynamics of the simulations are tested with orbital 
and Zeldovich collapse tests. Finally, we present several static and dynamical simulations using 
realistic cosmological parameters to highlight the differences between standard physics and f{R) 
physics. In general, we find that the f{R) modifications result in stronger gravitational attraction 
that enhances the dark matter power spectrum by ~ 20% for large but observationally allowed 
f{R) modifications. More detailed study of the non-linear f{R) effects on the power spectrum are 
presented in a companion paper. 



I. INTRODUCTION 

Explaining the accelerated cosmological expansion has 
become one of the greatest challenges in modern day 
physics [see e.g.,[l|. Of the many potential explanations, 
two classes are popular today: ones that propose new and 
unseen form of energy with negative pressure, and ones 
that modify the Einstein-Hilbert action. Of particular in- 
terest in recent literature are a class of modifications to 
the Einstein action by an addition of non-linear functions 
of the Ricci scalar, R, which have been demonstrated to 
cause accelerated expansion for a range of f{R) functions 
iii. 

In addition to explaining the cosmic acceleration, such 
f{R) modifications must satisfy stringent constraints im- 
posed by solar system tests 0,13 SQ as well as cosmolog- 
ical constraints imposed by recent dark energy measure- 
ments [3, [ifl [ill- Recently, a new class of f{R) models 
was proposed by Hu & Sawicki [ij, hereafter HS model] 
that has the flexibility to reproduce the observed cos- 
mological acceleration as well as to satisfy solar system 
bounds on deviations from the Newtonian limits of gen- 
eral relativity. In the proposed model, the scalar degree 
of freedom introduced by the f{R) modification becomes 
massive in regions of high curvature and is hidden by the 
so-called chameleon mechanism [l^ [ij] . 

In such models that evade both the cosmological back- 
ground test and solar system tests, the next strongest 
constraint comes from the effects of the f{R) modifica- 
tions to the intermediate lengths scales, in the regime 
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of cosmological large scale structures. Outside of dense 
dark matter halos, the light scalar degree of freedom 
produces a long range fifth force that enhances gravi- 
tational attraction and thereby potentially change clus- 
tering of matter. Such effects were studied in the lin- 
ear regime [iJ, [l5|, |la|, in which large deviations of 
dark matter power spectrum were obtained even for 
small f{R) modifications. Furthermore, an attempt to 
model some of the non-linear chameleon physics using a 
parametrized framework on top of a halo model are pre- 
sented in [17| . However, those studies remain inconclu- 
sive because the effects of non-linear growth of structure 
and the chameleon mechanism (which requires non-linear 
growth) are not included in the analyses. For such non- 
linear study, a realistic N-body simulation of structure 
formation in f{R) cosmologies is needed. 

In this paper, we introduce a formalism and an imple- 
mentation of a cosmological N-body simulation with fully 
non-linear treatment of f{R) modifications. Although 
somewhat limited in the attainable dynamic range, we 
show that our implementation can capture the essential 
non-linear features of the HS model that have large im- 
pact on the cosmological large scale structure formation. 
In a companion paper, we apply our methodology to re- 
alistic cosmological simulations and discuss the implica- 
tions of the f{R) model on the large scale structure of 
the Universe. 

This paper is organized as follows. In i }IIl we review 
the general properties of /(i?) cosmology and give de- 
tails on the particular model of ,f{R) that we choose. 
The numerical algorithms to solve for the f{R) cosmo- 
logical large scale structure formation are developed in 
^III[ and are tested in mV\ Several test cases under the 
HS model, highlighting the different aspects of the f{R) 
model, are presented in ifVl In addition, a brief discussion 



of full cosmological simulations with f{R) modifications 
are presented in ^Vj a more complete discussion will be 
presented in a companion paper. Finally, conclusions and 
discussions of possible future extension of our formalism, 
are given in JJVII 



II. f{R) COSMOLOGY 

The so-called f{R) theories are modifications to gen- 
eral relativity (GR) given by the following action: 



S 



(1) 



where R is the scalar curvature (i.e., the Ricci scalar), 
f{R) is an arbitrary function of R, k^ = 8ttG, and Lm is 
the Lagrangian of the ordinary matter. Setting f{R) = 
recovers ordinary GR without a cosmological constant. 
We follow the standard procedure of setting c = 1 and 
h = I throughout the paper unless otherwise noted. The 
variation of the action with respect to the metric results 
in the field equation for gap, 



Gap + fnRaP — [-^ - Ofn ) gap 



-^oc^pfR = I^^Tap 



(2) 



- df{B) 



where fn = L. , Gap is the unmodified Einstein tensor, 
and Tap is the energy-momentum tensor. We work with 
the standard assumption that the Universe is filled with 
a perfect fluid of the form 



TaP^ip + P)UaUp + Pgat3, 



(3) 



where p and P are the fluid energy density and pres- 
sure and U is the four velocity of the fluid. Because we 
are only concerned with modeling the effects of late time 
accelerated expansion, we neglect the relativistic energy 
component and thus set P = 0. For the background 
Friedmann- Robertson- Walker metric, the modifled Ein- 
stein equation becomes the modified Friedmann equation 



where ci , C2 and n are free dimensionless parameters and 
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(7) 



This form of f{R) can be tuned to match the ACDM ex- 
pansion history while not introducing a true cosmological 
constant. In addition, the above form has the potential 
to evade solar system tests via the so-called chameleon 
mechanism [13l 114] , thus remaining viable even with the 
stringent constraints placed by such tests. However, the 
HS model has interesting deviations from ACDM in the 
intermediate scales that result in distinct signatures in 
large scale structure formation. Such deviations have 
been studied in the linear regime [la, [la, \lMi but the 
lack of insights into the modification's behavior in the 
non-linear regime places a strong limitation on the appli- 
cability of comparison of such studies to observations. 

The free parameters, ci, C2, and n, are chosen so 
that the resulting background expansion history closely 
matches to that of ACDM. This requirement is equiva- 
lent to requiring the background fa field strength today, 
Jro, to be sufficiently small {fno '^ 1) or, alternatively, 
Rq ^ m. In this parameter regime, the field is always 
near the minimum of the effective potential, 

R = 87rGpM-2/«87rGpM + 2— m^. (8) 

C2 



In the limit of ACDM, we have 
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(9) 



and thus, to approximate the ACDM expansion history, 
we set 



ci „0a 

C2 iiM 



(10) 



Two free parameters remain, namely n and ci/c^, that 
control how closely the f{R) expansion history matches 
that of ACDM. From the high curvature hmit of f{R), 



H' h{HH' + H') + { + H'IrrR' = ^, 
6 3 



(4) 



where H is the Hubble parameter, //jij is the second 
derivative of / with respect to R, and ' denote derivatives 
with respect to In a. The over-bars signify that the quan- 
tities are defined at the cosmological background level. 
The trace of Eqn. ^ becomes a dynamical equation for 
Ir- 



?.UfR-R + fRR-2f = -i^p. 



(5) 



Recently, a phenomenologically viable model of f{R) 
was proposed in [12 . [iTj , in which 
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we see that larger ci/c| results in closer match to ACDM 
(i.e., f{R) approaches a constant as ci/c^ -^ 0) and 
smaller n mimics ACDM until later in the expansion his- 
tory. It is also important to note that the condition 
.Ro 3> m^ requires that R ^ m^ at the background 
level throughout the expansion history because _R is a 
monotonically decreasing function of time. Thus, in our 
numerical simulations, we are able to simplify the field 
equations given the high curvature assumption. 

Phenomenologically viable f{R) models require that 
IIroI be small so as to not significantly alter the cosmic 
expansion history, limiting the field strength to | /^q I ~ 1 
with current observations [l^ and to |/_ro| ~ 0.01 with 



future weak Icnsing surveys [T2|- An even stronger con- 
straint can be placed on the field by considering the dy- 
namics of our galaxy, resulting in an upper bound of 
l/flol ~ 10~^ [I4I. However, such analyses do not self- 
consistently model large scale clustering that potentially 
relaxes the tight upper bound. 

In the high curvature regime, our f{R) model can be 
expanded as, 
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(12) 



are interpolated onto a regular grid and t he p otential is 
solved only at the grid points [see, e.g., [l^, [201 • The 
choice to use a field representation of the potential in- 
stead of particle pair representation (as in Tree codes) 
is motivated by the fact that the change in the inter- 
particle forces are mediated by an auxiliary scalar field, 
namely /fl. The force law between a pair of particles in 
the presence of fu depends on the entire behavior of the 
field between the two particles and thus cannot be ex- 
pressed as a simple function of their separation. In this 
section, we outline the method we developed to obtain (j) 
by numerically solving equations (|15p and HI 



valid when R ^ m? . Furthermore, the field equation ([5]) 
can be simplified by neglecting the small fnR term, thus 
resulting in 



3n/i,-i?-2/ = -^V 



(13) 



Assuming the Friedmann-Robertson- Walker (FRW) met- 
ric, we subtract the spatially constant background quan- 
tities and thus the field equation becomes an equation 
for the perturbations. 



A. Code Units 

The code units are based on the convention used in 
[m, [Hi, where 



^■\ r ~ t ^ V 

r — a — , t— — , p = a — , 
ro to vo 

^ ~ 3 P B 3 ^ 

)= — , p = a —, R = a -—, 
9o Po Ro 



306 fn -6R= -n^Sp, 



(14) and 



where SR = R-R=R- Racdm, SJr = Jr - fR{R), 
5p = p— p, and the term proportional to 5f w /rSR has 
been neglected because Jr-^I. Finally, we work in the 
quasi-static limit in which V//j ^ dfji/dt, and the field 
equation is reduced to 



1 



VVi? = o [SRUr) - SnGSp] 



(15) 



The fn field is coupled to the gravitational potential, 
via 



-^—Sp- -5R{fR), 



(16) 



for which the derivation can be found in [12| . Equations 
P^ and (Uni) define a system of equations for the grav- 
itational potential, the solution to which can be used to 
evolve matter particles. Note that in the limit of ordinary 
GR, 5R = 8ttG6p, and the equation for the gravitational 
potential reduces to the unmodified equation, 



V^(t) ^ AnGSp. 



(17) 



The strategy that we use to simulate f{R) cosmology 
boils down to solving the two field equations, Eg. 1151 and 
1161 and using the resulting peculiar potential to update 
the dark matter density. 



III. NUMERICAL METHODS 



ibo 



(18) 

(19) 
(20) 

(21) 
(22) 

(23) 



In the above definitions, Lbox is the comoving simulation 
box size in h~^ Mpc, A^g is the number of grid cells in 
each direction, Hq is the Hubble parameter today, pc,o is 
the critical density today, and r^M.o is the fraction of non- 
relativistic matter today relative to the critical density. 

Before converting the field equations to code units, we 
restore the factors of c in Eqn (J15p , which results in 
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(24) 



Substituting all physical quantities and derivatives in 
Eqn ([24| by their code unit equivalent yields 



V2,5/fl 



n 



M,0 



SR , 



(25) 



where 



Our N-body simulation is based on the Particle-Mesh 
algorithm (hereafter PM), in which the particle densities 



P- P 



(26) 



The quantity c is the speed of hght in code units, i.e., 
c = c/vq. Similarly, we transform Eq. pop into code 
units and obtain 



V^ 



n 



Mfi 



25 



6 



(27) 



From this point on, wc work in code units unless other- 
wise specified, and we drop the tilde from code variables. 



B. Solving for fn and <j) 

The field equation for fn is a non-linear elliptical par- 
tial differential equation because the function Rifn) is 
non-linear in general. Because of this non-linearity, stan- 
dard spectral methods (e.g., FFT method) are not appli- 
cable and we therefore resort to a non-linear relaxation 
method for the solution. Given a general partial differ- 
ential equation. 



Hu) - /, 



(28) 



where L(u) is a non- linear differential operator on the 
field u and / is the source term, let us define 



C{u) = L{u)-f. 



(29) 



The problem of finding the solution, u, for Eq. ([28|) be- 
comes a problem of finding the root of C{u). The root- 
finding is carried out by an iterative Newton-Raphson al- 
gorithm, which defines the relaxation method [see, e.g.. 



old 



_c(?o_ 



(30) 



The only remaining difficulty is determining the appro- 
priate form of £(?i) and its discretization. For our model 
of f{R), the differential operator C{fR) is defined as 



C{fn) = 7f-VV. 



RUb) - R 



n 



+ 6. 



M,0 



(31) 



A naive and straightforward discretization of Eq. [^5] 
suffers from severe numerical instability. The differential 
operator in Eq. [31] is not defined for non-negative values 
of fa (see Eq. [T2|) , and thus the iterative solution fails as 
soon as an iteration oversteps the true solution into the 
forbidden region. We find that placing an artificial ceiling 
on the value of fa does not adequately solve the problem 
because the stability becomes conditional on very precise 
tuning of the ceihng due to the steepness of Rifn) near 
fn = 0. The best way to avoid the problem in this model 
is to redefine the field as 



Jb. = fBe"" 



(32) 



and solve for u instead of /r. With this substitution, 
the new field u has infinite domain and does not suffer 



from abrupt instabilities. The differential operator thus 
becomes 



>C(.fe) 



ac 



n 



M,0 



■/flV • (e"Vu) - 



RifBe^ - R 



S. (33) 



In our implementation, the differential operator is dis- 
cretized as a variable coefficient Poisson equation, in 
which the value of u at a grid cell identified by i,j, and 
k is updated using the red-black Newton-Gauss-Seidel 
(NGS) scheme [2j|- The actual discretization is too long 
to be displayed in the body of the text, and is instead 
placed in Appendix [X] We use periodic boundary condi- 
tions to close the NGS scheme at the edges of our simu- 
lation boxes. 

One potential problem with this substitution is the fact 
that we added additional non-linearity into a problem 
that is already highly non-linear. In particular, the addi- 
tional non-linearity is exponential, and thus the errors in 
the discretized solution, Mij-,fc, are potentially enhanced 
exponentially. In our experience running cosmological 
simulations with f{R) modification, we find that the ef- 
fects of this addition are negligible in all cases and the 
iterations converge to the same solution whether this sub- 
stitution is made or not, while avoiding the singularity 
of the curvature field at ff> ~ 0. 

Given the solution to ff> using the NGS algorithm and 
the density field, we can compute the solution to using 
Eq. 1271 The equation is a standard linear Poisson equa- 
tion and thus is solved efficiently using the fast Fourier 
transform method. We adopt the cloud-in-cell density 
interpolation algorithm (hereafter GIG), which treats a 
particle as a uniform density cube that is equal in size to 
the underlying grid cell, to compute the density field. 



C. Multigrid Acceleration 

In general, relaxation methods for solutions of par- 
tial differential equations are very inefficient and com- 
putationally expensive. However, this inefficiency can 
be mostly alleviated by the use of multigrid techniques. 
The multigrid method [2J| is a class of algorithms to 
solve partial differential equations using a hierarchy of 
discretizations with increasingly coarse grids. In particu- 
lar, it exploits the fact that a typical relaxation schemes, 
such as the NGS scheme previously detailed, smooth out 
the high frequency residual errors faster than the low 
frequency error modes. Thus, by interpolating the low 
frequency error modes onto a coarser grid, the same er- 
ror mode now appears as a high frequency mode on the 
coarser grid and can be efficiently reduced by relaxation 
passes on that grid. 

In our working code, we implement a particular type of 
multigrid method called the Full Approximation Scheme 
[25l . hereafter FAS], which is particularly suited for non- 
linear boundary value partial differential equations. In 
order to define the FAS algorithm, we first define two 
grids, the fine grid {flh), and the coarse grid {i^2h)- In 



addition, we define two inter-grid operators, one that re- 
stricts a fine grid field onto a coarse grid, and one that 
interpolates a coarse grid field onto a fine grid. We label 
the former as If^ and the latter as /^^, and leave the 
details of their implementations to Appendix [Bl Finally, 
we denote the quantities defined on a coarse grid by a 
superscript 2h and the quantities defined on a fine grid 
by a superscript h. 

The two grid FAS scheme can be summarized by the 
following. Given the initial guess to the solution, u^ , the 
FAS algorithm can be summarized as follows: 

• Relax on f2/i (pre-snioothing) 

• Compute: f' = /f (/'' - L''{u'')) + L'^'^ilf^'") 
. Solve: L^'^iu^'^) = f' 

• Correct the solution: u'^^.^ = u'* + ll^'hi^^^ - ll^u''') 

• Relax on f2/i (post-smoothing) 

In a true multigrid method, the third step above is also 
carried out using a multigrid method, thus forming the 
recursive hierarchy of coarser grids. The recursion is 
stopped when the number of coarse grid cells per dimen- 
sion reaches four; the solution at this level is obtained 
by iterating NSG until convergence. All relaxations are 
carried out using the NGS scheme that is appropriate for 
the given coarse (or fine) grid. 

In our implementation, we use the full- weighted restric- 
tion operator (/^'*) and its adjoint, the bilinear interpo- 
lation as the prolongation operator {l2h)- ^^ practice, we 
find that the choice of the restriction and prolongation 
operators does not affect the stability of convergence for 
most typical cosmological simulation. However, we find 
the convergence rate when using full- weighting and bilin- 
ear interpolation to be better than other, simpler choices. 



D. Particle Dynamics 

Given the solution for Jr field, the Newtonian po- 
tential is computed by solving Eqn (|27[) . which is a 
linear Poisson equation with a non-standard (i.e., non- 
GR) source term. Because of its linearity, such equation 
can be solved efficiently using a fast Fourier transform 
method [FFT, see[l^. The forces on the dark matter 
particles are computed using Newton's equations in co- 
moving code coordinates. 



dx p 

da ha? ' 

dp _ V(/) 
da a 

Because the expansion history of our f{R) model is tuned 
to that of flat ACDM, a is computed simply as 



(34) 
(35) 



d = a-i/2 



n 



M,0 ■ 



^A.oa^, 



(36) 



where the over-dot denotes a derivative with respect to 
coordinate time in code units. The gradient of the grav- 
itational potential is computed using a standard finite 



difference scheme consistent with our grid density assign- 
ment operator, namely the GIG scheme. We evolve the 
particles using a second-order accurate leap-frog method 



IV. TESTS OF THE CODE 

Our N-body simulation is implemented in C-f-l- with 
shared memory parallelization using OpenMP. In the fol- 
lowing subsections, we present various tests of the imple- 
mentation to assess the correctness of the code as well as 
its computational speed. 



A. Analytically tractable case 

First, we construct a density field 5 that has a known 
fii solution to check that the code can recover the true 
solution. In ID, the field equation (|25p becomes 



d'^x 



VI 



M,0 



5R 



-6 



(37) 



which, after setting all constants to 1 and taking n = 1 
and -R = 1, is 



d'^x 



i-h) 



-1/2 



1-35 



(38) 



We take the analytic solution to be 

^ . , Sttx 

JR = sm 

and substitute the solution into equation 
= I — I sm 



(39) 



to find 



1 
3 



27ra; 
2ttx 



-1/2 



(40) 



Thus, setting the density field to the expression in ((40|) 
and numerically solving for fji allows us to compare the 
code accuracy against the known solution, Eq. 1391 

In Fig. [1] we show the results of applying our code 
to this problem with three different grid resolutions. In 
the left panel, fractional discretization errors, defined as 
/i?,code//fl,exact " 1, are showu, and the residual errors, 
defined as C{u), are shown in the right panel. The av- 
erage discretization error decreases by approximately an 
order of magnitude for each two-fold increase in grid res- 
olution. Thus, for A'g ~ 512, we expect the discretization 
error to be on the order of 10"^, and therefore gives us a 
useful stopping criteria for solution convergence based on 
the requirement that the discretization error is the domi- 
nant source of error. We terminate the iterative multigrid 
process when the residual error (which can be computed 
without knowing the true solution) is smaller than 10"^", 
guaranteeing the residual error to be much smaller than 
discretization error for our highest resolution simulations. 
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FIG. 1: The fractional discretization error (left), shown as /H,codo//fl, exact — 1, and the fractional residual error (right), £{fji), 
for the analytically tractable case of i]IV Al The discretization error for the highest resolution box is ~ 10~^, and allows us to 
define a sensible stopping criteria for the multigrid iterations. The fractional residual error is floored by limits on numerical 
precision, which is ~ 10~^^. 



B. Point mass 



In this test, we place a point mass (consistent with the 
CIC scheme) at the origin of a periodic simulation box, 
with the corresponding density field given by 



j(i) 



lO-^iV^ if j 



-10 otherwise 







(41) 



where the indices i, j, and k, refer to the grid cell indices 
in X, y, and z directions, respectively. In the subsequent 
discussion in this section, we restore the tilde notation to 
distinguish code variables and physical variables. With 
this density configuration, outside the grid cell at the 
origin, we expect that 



V^J/h 



^5R 



(42) 



where meff is the eflcctivc field mass of the /^ field, given 
in physical coordinates by 



ml, = - 



1 fi + h 



Jrr 



-R 



o-Irr 



(43) 



for small fn. The quantity fjm is the second derivative 
of f{R) with respect to R. Thus, the solution to Eq. (142]) 
is a Yukawa like potential. 
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FIG. 2: The computed and expected solutions for the point 
mass test. The analytic solution is arbitrarily normalized such 
that it and the numerical solutions match at scales where the 
simulation is expected to best match the analytic solution. 
This scale corresponds to a; = lO/i^^ Mpc. The background 
effective fleld mass is 0.13/i Mpc~^. 
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FIG. 3: Orbit test. A point mass is placed at the center of 
a 128'' grid, and the trajectory of test particles are simulated 
without f{R) modification. The inner particle is placed at 
two grid cells away from the center, while the outer particle is 
placed seven grid cells away. Both particles are given initial 
velocities such that in they would be in circular orbit without 
f(R) modifications. 



FIG. 4: Zeldovich ID plane wave collapse with no f(R) modi- 
fications. Particle momenta (p) is plotted against particle po- 
sitions (x). The points represent the simulation output, and 
the line represents the exact solution given by the Zeldovich 
approximation. The snapshot is taken at a = across ~ 1. 



where 



"loff = \/3 



n 



M,0 



ac 



-Weff 



(45) 



is the effective comoving field mass in code units. 

Fig. [21 shows the fa field solution computed on a A^g = 
256 grid. The simulation parameters were: Lbox = 400 
Mpc^ flA/.o = 0.3, Qa^o = 0.7, h = 0.7, and fR,o = 
— 10~^. On the same plot, the analytic solution with 
the expected rhcs is plotted. The ~ 10% discrepancy at 
X ^ 2h~^ Mpc is due to the fact that the approximation 
S/r is only valid for small S/r and thus is 



^cff 



SR 

only appropriate at large distances away from the central 
near point mass. Hence, the Yukawa solution in Fig. [5] 
is normalized such that the solution best matches the 
simulation at a; ~ 10 h~^ Mpc where the S/r becomes 
sufficiently small. At x > 40/i~^ Mpc, the solution Sfu 
approaches the level of discretization error for Ng = 256 
(~ 10~^) and hence begins to show deviations from the 
analytic solution. 



C. Dynamics 



As previously described in i jIIIDl the particle trajecto- 
ries are integrated using a second-order accurate leap- frog 
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FIG. 5: Power spectra computed from cosmological simula- 
tions without /(-R) modifications. The Smith et. al. fit is 
plotted solid line, showing good agreement with our simula- 
tions. The vertical lines, from left to right, are the parti- 
cle half-Nyquist scales, defined as fc^N = 7rA'j,/(4Z/box), for 
Lbox = 256, 128, and 64 /i"^ Mpc boxes. 



scheme. In Fig.[3l we show the orbits of test particles or- 
biting around a central point mass. The test particles are 
placed at two and seven grid cells away from the point 
mass and are given initial velocity such that they would 
be in circular orbits if there were no integration errors 
and no f{R) modifications. The solid lines show the sim- 
ulated orbit for the two different radii. Far away from the 
point source, where the forces are adequately resolved, 
the orbit is nearly circular as expected for unmodified 
GR. The smaller orbit, with a theoretical radius of two 
grid cells, is not well resolved by the underlying potential 
grid. However, the simulated orbit is still nearly circular, 
with only a few percent deviation after two complete or- 
bits. The two-body force law, and its modifications due 
to f{R) models, will be further studied in W Al 

Another popular test of the particles dynamics as well 
as of the accuracy of the potential solver is the Zeldovich 
pancake collapse test [22, 123, 123 ■ The evolution of a ID 
sine wave is described exactly by the Zeldovich solution, 



Xi{a) = Qi + AG{a) sin. 



2Trqi 



^box 



Pi{a) = Aa^ G {a ~ Aa/ 2) sin 



J-ihox 



(46) 
(47) 



where qi = lAx, G{a) is the linear growth function, Aa is 
the time step used in the simulation, and A is a normal- 
ization constant that fixes the time at shell crossing. By 
setting the initial conditions at a = ainit according to the 
Zeldovich solution, we can compare the subsequent parti- 
cle evolution to the same solution to gauge the accuracy 
of our particle dynamics. 

For our test, we use rijvi.o = 0.24, flAfi = 0.76, h = 
0.73, and no f{R) modifications. The simulation box is 
set to Lbox = 100 h~^ Mpc. and we use 64"^ particles and 
64'^ grid cells. Fig. 2] shows the particle momenta versus 
the particle positions at the time of shell crossing, which 
is set to be a = 1. The dots denote the simulation output 
and the line represents the exact solution given by the 
Zeldovich approximation. As expected, the agreement 
between our code and the Zeldovich solution is good, with 
maximum deviation less than 1%. 

As another check of the particle dynamics, we have 
run several cosmological simulations without f{R) modi- 
fications to check our results against the Smith et. al. 
fits to the power spectrum [29[. We run three simu- 
lations with box sizes (Lbox) of 256 h~^ Mpc (L256), 
128 h-^ Mpc (L128), and 64 h'^ Mpc (L64). All sim- 
ulations have the following cosmological parameters re- 
scmbliiig the third year Wilkinson Microwave Anisotropy 
Probe [nl, WMAP3] results: I^m.o = 0.24, f^A.o == 0.76, 
n^Q = 0.04181, Ho = 72 km/s/Mpc, and erg = 0.76. We 
take the slope of the primordial power spectrum, n, to 
be 0.958. 

The initial conditions for the simulations are created 
using Enzo l30| , a publicly available cosmological N-body 
-|- hydrodynamics code. Enzo uses the Zeldovich approx- 
imation to displace particles on a uniform grid according 



to a given initial power spectrum. We use the initial 
power spectra given by Eistenstein & Hu [3l|], not in- 
cluding the effects of baryon acoustic oscillations. The 
simulations are started at 2 = 49. 

All simulations are run with 512 grid cells in each 
direction (i.e., TVg = 512) and with 256'^ particles. 
Thus, the formal spatial resolutions of the simulations 
are 0.5 h'^ Mpc, 0.25 h'^ Mpc, and 0.125 h~^ Mpc 
for the largest, middle, and smallest boxes, respectively. 
The corresponding mass resolutions are 2.76 x 10^^ Mq, 
3.45 X 10^° Mq, and 4.31 x 10^ Mq. 

For each simulation box configuration, we run multiple 
simulations with different realization of the initial power 
spectrum in order to reduce finite sample variance. Two 
L256 simulations, three L128 simulations, and four L64 
simulations were run. 

In Fig. [51 we show the power spectrum of our simula- 
tions as well as the Smith et. al. fit. Multiple simulations 
of the same box size are averaged in the figure. The ver- 
tical lines denote the half-Nyquist scale of the particles, 
plotted as a rough guide to show the resolution limits 
of our simulations. We see from the plot that our sim- 
ulations match the Smith et. al. fit to within ~ 10% 
at most scales smaller than the half-Nyquist scale. The 
large scatter at low k is seen because only a small number 
of the largest wavelength modes can fit in a simulation 
box. 



D. Time derivative 

In our simulation, we work in the quasi-static limit 
of the full field equation for fn, Eq. I5l in which we ne- 
glect the time derivative term. There is no simple way 
to check the validity of our assumption short of running 
a full simulation including the time derivative term; nev- 
ertheless, we can check that our assumption is at least 
self-consistent by running modified cosmological simu- 
lations with quasi-static assumption and comparing the 
time derivative of the fu field to its spatial derivative. 

We use the same simulation setup as the cosmologi- 
cal simulations in §IV CI except for the inclusion of f{R) 
modification with fjiQ = — 10~^. The time derivative of 
the field is estimated by finite differencing of fields at 
successive time steps. In Fig. [6l we show the fraction of 
the 2-norms of the time derivative and spatial derivatives 
of the field at different scale factors. Although the ratio 
varies by time, we see that the time derivative is generally 
much smaller than than the spatial derivative and thus 
that our simulations are consistent with our assumptions. 



E. Code timing and memory requirement 

Even with the efficiency of the multigrid method, solv- 
ing a 3D non-linear partial differential equation with rea- 
sonable resolution is still a computationally expensive en- 
deavor. In Fig. [71 we plot the residual error as a function 
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FIG. 6: The ratio of the time derivative of the /h field to 
the spatial gradient. The 2-nomi is defined simply as ^^ {x'^), 
where x is either 9^/ij or V^/jj. The factor of c? comes from 
the definition of the FRW metric. 
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FIG. 7: The residual error, defined as the 2-norm of the l.h.s 
minus the r.h.s of Eq. [42] divided by the 2-norm of l.h.s, is 
plotted as a function of CPU time spent in the iterative partial 
differential equations (PDE) solvers. The NGS scheme is only 
efficient at reducing low wavelength error modes and thus the 
convergence rate stalls as the large wavelength mode begins to 
dominate the error budget. The multigrid scheme can reduce 
error modes of all sizes equally well, and thus retains good 
convergence rate throughout the iterations. 



of the CPU time spent during the NGS and multigrid 
iterations of the point mass solution studied in ijIVBI 
The computations were carried out on a 2.8 GHz AMD 
Opteron 2220 processor without using OpenMP multi- 
threading (i.e., single threaded). For the NGS method, 
the high frequency error modes are reduced more ef- 
ficiently than low frequency modes. Thus, the resid- 
ual error goes down fast during the initial few hundred 
seconds of the NGS iterations, until the low frequency 
error modes dominate the overall residual error bud- 
get. Thereafter, the iterations converge slowly, rendering 
the method unacceptable for cosmological simulations in 
which the fn field must be evaluated hundreds of times. 
With the multigrid method, because of its use of coarse 
grids, error modes of all frequencies decay at nearly the 
same rate, and thus the overall convergence is greatly 
accelerated. Improved convergence rates using multigrid 
method, sometimes by as much as two orders of magni- 
tudes, makes fa cosmological simulations a reality. 

As shown in Fig. [3 convergence to a reasonable er- 
ror tolerance takes ^ 1000 s on an AMD Opteron 2220 
processor even for a modest Ng = 256 resolution. The 
FAS scheme has one of the best computation time scal- 
ing, growing linearly to the number of grid points (FFT, 
for comparison, scales as NlogN), and thus a TVg = 512 
simulation will converge in 10,000 s. The fji field equa- 
tion must be solved for each individual time step of the 
simulation, and thus resulting in potentially thousands 
of multigrid solutions totaling a few million seconds of 
CPU time. With our discretization scheme, we find that 
the simulation code spends ^ 95% of CPU time solving 
for the fn field. Such percentage is at best a rough esti- 
mate, as the number of multigrid iterations required for 
convergence depends on the setup of the simulations. 

Fortunately, several methods to speed up the simula- 
tion are possible. First, the particle positions between 
successive time steps does not change greatly, and thus 
the fu solution should not change significantly as well. 
The fn field solution from the previous time step provides 
a good initial guess for the iterative solver and we find 
that it results in ~ 10% faster convergence. Secondly, 
the FAS scheme is easily parallelized using shared mem- 
ory parallelization with almost perfect scaling with the 
number of processor cores available. The NGS smoothing 
used extensively within the FAS scheme can be arranged 
in the so called red-black ordering [23| , such that smooth- 
ing of one particular grid point is independent of all other 
points except for its six nearest neighbors. This locality 
allows the smoothing to be split into local domains, each 
of which can be computed on separate processors. Com- 
bining such acceleration methods, a typical iVg = 512 
simulation can be carried out in roughly 10 days on our 
hardware. We see no difficulties in extending the par- 
allelism to distributed memory, although wc have not 
implemented such extension. 

Memory requirement for the FAS scheme can be high 
compared to traditional PM simulations. The hierarchy 
of coarser grids add ^ 15% to the storage requirement for 
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FIG. 8: Two-body acceleration as a function of separation 
for several /_r field strengths. The accelerations are shown as 
relative differences from the exact Newtonian two-body accel- 
eration. The large scatter at a:: ~ 2 is because of truncation 
errors resulting from differencing the potential at scales com- 
parable to the grid size. 



FIG. 9: Two-body acceleration as a function of separation for 
several /_r field strengths. The spherical blob has a constant 
density of (5 ~ 500 and a radius of 20 grid cells (15.6 h~^ Mpc). 
The increased scatter in the accelerations at the surface of the 
sphere are due to the fact that a smoothness of the spherical 
surface is limited by the finite grid size of the simulation. 



the ffj, field eompared to the normal storage requirement 
for a 3D grid scaling as N^. Furthermore, in order to 
accelerate the computation, we store the current residual 
error and use one additional grid hierarchy of the same 
size for transient computations. For a A'g = 512 grid, 
the required nremory for our FAS inrplementation is ^ 
3.5 GiB using double precision floating point format (64 
bits each). For comparison, storage of the density field 
requires 1 GiB, and the storage of dark matter particles 
requires 6 GiB for 512'^ particles. 



V. RESULTS 

In this section, we study a few /a simulations in or- 
der to highlight the differences between /^ and ordinary 
ACDM physics. A more comprehensive study of the non- 
linear fj^ physics is detailed in a companion paper |32| . 



A. Two-body force laws 

The first interesting case is the study of two body force 
laws in fn cosmology. As is well known, in ordinary GR, 
two-body forces decay as r~^ as the separation between 
the bodies, r, is varied. However, in //j cosmology, the 
new field mediates a fifth force that modifies two-body 



interactions and potentially change the large scale struc- 
tures of the Universe. 

Such deviations from GR are easy to see using the 
same setup as the near point mass text in § IIVBI We 
solve for the peculiar gravitational potential, 0, with the 
fa field and compute the gravitational acceleration at 
random positions in the simulation box, labeling them 
gf(R)- In addition, we conrpute the standard Newtonian 
force using the inverse square law for the same set of 
points, labeling them gNowton- We use a 400 h~^ Mpc 
computation box with Ng = 256, ^Im ~ 0.3, and f2A = 
0.7. Fig. [5] shows the relative differences between gf(R) 
and 5Ncwton as a function of the distance away from the 
point mass. For strong fa fields, the gravitational forces 
near the central point mass are enhanced by as much as 
~ 30% over the unmodified forces. The range of the force 
enhancement is determined by the Compton wavelength 
of the ffj field, which depends on the local field mass as 



A. 



cff 



l/i 



|(n+2)/2(n+l) 



(48) 



Thus, the stronger the field, the farther out the gravi- 
tational enhancement of the fn field propagates, which 
is confirmed by Fig. [51 For the very weak field with 
fa = — 10~^, the Compton wavelength is much smaller 
than the computational grid cells and therefore the grav- 
itationally enhanced regions are not resolved. 

In the strong field limit, the large Compton wavelength 
of the fn field makes it smooth and stiff, resisting the 
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fluctuations in the underlying density fleld. Thus, in this 
hmit, the deviation of the field, and hence of the curva- 
ture field -R, vanishes and we have SR « 0. The equation 
for the gravitational potential, Eq. [121 reduces to 



V^ch == 



IGttG , in 



(H 



Sp. 



(49) 



We see that the result is an effective enhancement of 
the Newton's constant by 33%, and hence the two-body 
forces enhanced by the same amount. Our two-body 
numerical solution captures this effect well, with the 
/flo = —10^'^ acceleration curve plateauing at the cor- 
rect value. 

Features not due to the presence of the fn field are 
also seen in Fig. [H The large scatter near a; ~ 2 are due 
to the lack of force resolution at scales comparable to the 
underlying grid cells. At scales even smaller than the grid 
cells, computed forces vanish; this force decay is essential 
for the N-body system to remain collision- free. On the 
other end of the scale, the periodic boundary condition 
begins to affect the forces as the test masses becomes 
gravitationally attracted to the neighboring point mass. 

At first, Fig. [H] seems to contradict the stringent solar 
system tests carried out so far and would constrain ffj 
to very small values. However, the particular fa model 
chosen in this study shows the so called chameleon be- 
havior [Ij, |lj] , in which the field strength is pushed to 
near zero in high density regions. Small field strength 
in the high density region causes the effective field mass 
to increase and therefore the Compton wavelength to de- 
crease. Thus, in our solar system, which sits in a rela- 
tively high density region of the Universe, the chameleon 
behavior may cause the fj^ field to be undetectable using 
our current technologies. 

Our code is expected to reproduce this non-linear 
chameleon behavior that is not incorporated into the 
linear analyses carried out so far. In order to investi- 
gate such effects, we place a spherical top-hat of con- 
stant density at the center of a 200 Mpc simulation box. 
The sphere is 20 grid cells in radius, corresponding to 
15.6 Mpc, and has an uniform over-density oi S ^ 500. 
Random test masses are placed both inside and outside 
the sphere, and the accelerations of the masses are cal- 
culated using the exact solution (^Newton) and Ir sim- 
ulations with various field strengths ((7/(_r)). The exact 
solution of the acceleration increases linearly within the 
sphere and decreases as r"^ outside the sphere. 

In Fig. [51 wc show the relative difference between gf[R) 
and gNcwton as a function of the separation between the 
center of the spherical blob and the test mass. Outside of 
the blob, the gravitational force is enhanced as expected 
from the results of Fig. [51 However, inside the blob, the 
fn field quickly dies down and the gravitational forces 
converge to that of ACDM. Strong fn fields can penetrate 
the density field farther, as seen in the case of l/ijj = 10"'^ 
that decays in ~ I0h~^ Mpc. 
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FIG. 10: The relative differences in particle momenta in Zel- 
dovich pancake simulations with varying fit field strengths. 
The simulation snapshot is taken at a = across ~ 1- 



B. Revisiting the Zeldovich pancake 



The Zeldovich ID plane wave collapse offers insights 
into the differences in the particle dynamics in the pres- 
ence of f{R) modifications. We run three Zeldovich pan- 
cake simulations starting with the same cosmology and 
initial conditions as in Fig. [H except for f{R) modifica- 
tions. In Fig. 1101 we show the resulting difference in par- 
ticle momenta relative to those of unmodified GR sim- 
ulation. Because of the fifth force due to Jr, the sine 
wave collapses faster as the field strength is increased. 
In addition, we see the effects of the finite range of the 
fifth force, which is determined by the Compton wave- 
length of the fa field, Eq. [151 In the case of weak field, 
fjiQ = —10^*, there is no significant enhancement in the 
particle momenta at distances greater than ~ 10 code 
units (^ 16/i^^Mpc) away from the collapsed pancake, 
and the range of the fifth force is shown to be larger for 
stronger fn fields. For the strongest field, the fifth force 
does not decay sufficiently before the periodic image of 
the pancake starts to become important. 

The Zeldovich collapses show no sign of the chameleon 
behavior, even for the weakest fa field. The pancake is 
essentially a sheet mass and thus does not have an ex- 
tended region of high density. Therefore, the fa field 
does not appreciably decay near the pancake and no 
chameleon behavior is expected. 
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Fig. [TT] shows the relative differences between the 
power spectra with and without f{R) modifications, 
along with the linear prediction for the power spec- 
trum enhancement taken from [12| . The linear predic- 
tion assumes no chameleon behavior and thus is expected 
to overestimate the power spectrum enhancement when 
there is strong suppression of fa field in high density 
regions. The simulated power spectra match the linear 
predictions at large scales, where non- linear effects, such 
as the chameleon mechanism, is expected to be negligi- 
ble. The smallest simulation box, Lbox = 64:h~^ Mpc, 
does not have sufficient statistics to accurately represent 
the largest modes. At scales smaller than k ~ O.lh Mpc, 
we see a striking deviation of the simulations from the 
linear prediction, showing clear signs the previously un- 
quantified effects of non-linear /(i?) physics. 

More detailed comparison of the power spectra are car- 
ried out in the companion paper. 
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FIG. 11; Relative differences between the power spectra of 
|/flo| = 10~* simulations and unmodified ACDM simulations. 
The linear prediction is taken from Hu & Sawicki [12| and as- 
sumes no chameleon behavior. The power spectra for each box 
size are only plotted up to their respective half-Nyquist wave 
number, which are 0.79/i Mpc~\ 1.57h Mpc~^, and 3.14/i 
Mpc~^ for L256, L128, and L64 simulations, respectively. 



C. Cosmological PoAver Spectrum 

Finally, we apply our code to a realistic cosmological 
simulation in order to study the effects of the fn field to 
the large scale structures. We have run f{R) cosmolog- 
ical simulations using the same setup as those found in 
give] and with fm = -10"^ The strength of the Jr 
field is chosen to highlight the differences in the power 
spectra. 

With our simulation setup having twice as many grid 
points as particles per one dimension, we find that at the 
initial redshift, some of the grid cells in the simulations 
have no particles nearby and therefore have zero density. 
At early redshifts, the cosmology is very nearly ACDM, 
and we have R = SirGp. Thus, in the empty grid cells, 
the curvature is almost exactly zero and we have a formal 
divergence of the fn field in such grid cells. The particle 
dynamics, however, is governed by the peculiar potential, 
and therefore by the curvature and density fields, not di- 
rectly by the //j field (see Eg. [27)1 . We have checked that 
our potential solver and particle trajectory integrators 
remain stable and accurate even when such divergences 
are present (see Fig. [3]). At late times, we also find empty 
grid cells within large voids. However, the mean Comp- 
ton wavelength of the f^ field becomes sufficiently large 
and smoothes out the fn field, thereby avoiding any po- 
tential divergences in the field at late times. 



VI. CONCLUSION 

In this paper, we have introduced a method to con- 
struct a cosmological N-body simulation with a class of 
viable f{R) modifications to general relativity. Our simu- 
lation code, implemented in standard C-|--|-, solves the HS 
model of f{R) modification using particle mesh method, 
particularly suited because the scalar degree of freedom 
in the model is well represented as a scalar field (i.e., 
a mesh) coupled to the Newtonian potential. The non- 
linear field equation for fn is solved on the mesh using 
an efficient multigrid scheme that is shown to converge 
robustly to the desired solution. The code is also shown 
to reproduce the chameleon behavior of the HS model. 

The code should allow us to gain insight into the quasi- 
nonliear physics of f{R) modified gravity models. In 
general, gravitational forces are enhanced under the HS 
model, and we show that the resulting power spectra are 
also enhanced compared to the ACDM power spectrum. 
We further show hints that the power spectrum enhance- 
ment is smaller than the linear predictions due to the 
chameleon mechanism that has not so far been modelled. 
In a companion paper [34I, we further study the effects 
of the HS model on the dark matter power spectrum 
and its cosmological implications. In addition, effects on 
the cluster mass function and weak lensing statistics are 
potentially interesting and should be investigated in the 
future. 

Lastly, the work presented here can be used as a gen- 
eral framework for non-linear simulation of any mini- 
mally coupled scalar field model of gravity, provided that 
the resulting field equations are sufficiently stable under 
relaxation and multigrid schemes. In particular, an in- 
teresting exercise might be to apply this framework to 
the Dvali, Gabadadze, and Porrati (DGP) model |33| . 
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APPENDIX A: DETAILS OF THE 
DISCRETIZATION 



In our implementation, Eq. ()33|) is discretized much 
like a standard variable coefficient Poisson equation. We 

let a = e", a^+ij^k == (ajj,fc + aj+ij\fc)/2, and a,_i_j-^fc == 
(aij.k + ai-ij,fc)/2. Then, we have 



C = -r 



-f^ [^^~U^ 









RifR^Uk) - R 



^i.j,k 



(Al) 



Wc find that the above discretization performs better 
than discrctizing the equation after expanding out the 
derivative operators. Even using this discretization, we 
find that the iterations can diverge if the initial guess is 
sufficiently poor. We find that using variable, and of- 
ten smaller, Newton- Raphson step sizes (i.e., successive 
■wnder-relaxation) can effectively avoid such divergences 
until the trial solution is sufficiently close to the true so- 
lution. 



APPENDIX B: FAS IMPLEMENTATION 

In our FAS implementation, we use the full-weighting 
restriction operator {ifj'') and the bilinear interpolation 
operator (/2h) for prolongation. The full- weighting oper- 
ator, in 3D, is given by 



where 



2h 



E 

a,/3,7— - 



'^Q,/3,7'^2'i+a,2j+/3,2A;+7 



(Bl) 



0'q,/3,7 
Ca,/3,7 
Cq,/3,7 



1 

16 
1 

32 
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64 



if |a| - 
if |a| 
if |a| 
if lal 



+ |7| = 

^l + l7|-2 
^1 + l7l - 3. 



Note that the 2h superscript denotes a variable defined 
on the coarse grid (i.e., a grid with spacing of 2h) and 
the h superscript denotes a variable defined on the fine 
grid. The bilinear interpolation operator is given by 
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Other choices of restriction and prolongation operators choices result in the fastest and most stable FAS imple- 
are certainly possible. However, we find that the above mentation for this particular problem. 
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